Storage and Backup
Student Notes
Module: Operating Systems 3 (Virtualisation & Cloud Technologies)
Topic: Storage and Backup
Estimated Reading Time: 25 Minutes
Welcome to Week 4!
You have learned how to create virtual machines (Compute) and how to connect them (Networking). Now, it is time to tackle the third and perhaps most critical pillar of the cloud: Storage.
In the physical world, if a hard drive fails, you lose data. In the virtual world, we have technologies like ZFS, Ceph, and Snapshots that make data resilient, portable, and practically immortal. This week, we move beyond simply "installing to a disk" and learn how the Linux kernel manages block devices, volume groups, and file systems.
What You'll Learn This Week
- Linux Storage primitives: Partitions, Block Devices, and Filesystems.
- Volume Management: Using LVM (Logical Volume Manager) to resize disks on the fly.
- Next-Gen Storage: Using ZFS for instant snapshots and self-healing data.
- Proxmox Storage Architecture: How PVE manages Local vs. Shared storage.
- Disaster Recovery: The critical difference between a Snapshot and a Backup.
Part 1: Linux Storage Fundamentals (Revision + New Concepts)
In Operating Systems 2, you gained hands-on experience with Linux storage management, including fdisk for partitioning, LVM for flexible volume management, and NFS for network-attached storage. This section begins with a focused revision of those familiar tools, reconnecting you with concepts like /dev/sda, lsblk, and pvcreate, but now viewed through the lens of virtualization. We will also introduce an entirely new storage technology: ZFS, the enterprise-grade filesystem that Proxmox supports natively for advanced features like instant snapshots, self-healing data, and copy-on-write operations. While LVM and fdisk are foundational, ZFS represents the cutting edge of storage management in modern virtualized environments. Think of this as bridging your foundational Linux knowledge with enterprise virtualization infrastructure.
1. Block Devices and Partitions
In Linux, everything is a file. A hard drive is a special type of file called a Block Device. The Linux kernel assigns specific naming conventions to different storage technologies. Traditional SATA and SCSI drives are represented as /dev/sda, /dev/sdb, and so on, while modern NVMe drives use a different naming scheme such as /dev/nvme0n1 and /dev/nvme1n1. When a disk is divided into sections, these divisions are called partitions, and they are accessed through additional numerical suffixes—for example, /dev/sda1 refers to the first partition on the first drive.
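The naming rules above can be sketched as a small shell function. This is a minimal illustration that assumes only the two schemes described here (sdX and nvmeXnY); the function name split_device is hypothetical, and other schemes such as mmcblk or vd* are not handled. Note that NVMe disk names already end in a digit, so the kernel inserts a "p" before the partition number.

```shell
#!/bin/sh
# Split a block-device path into its parent disk and partition number.
# Handles only the sdX and nvmeXnY naming schemes described above.
split_device() {
    name=${1#/dev/}                  # strip the /dev/ prefix
    case $name in
        nvme*n*p[0-9]*)              # NVMe partition, e.g. nvme0n1p2
            disk=${name%p*}          # drop the trailing "p<number>"
            part=${name##*p}         # keep what follows the last "p"
            ;;
        sd*[0-9])                    # SATA/SCSI partition, e.g. sda1
            disk=$(printf '%s' "$name" | sed 's/[0-9]*$//')
            part=${name#"$disk"}
            ;;
        *)                           # whole disk, e.g. sdb or nvme0n1
            disk=$name
            part=
            ;;
    esac
    printf '%s: disk=%s partition=%s\n' "$1" "$disk" "${part:-none}"
}

split_device /dev/sda1
split_device /dev/sdb
split_device /dev/nvme0n1
split_device /dev/nvme0n1p2
```

On a real system you would not parse names by hand; lsblk reports the same parent/child relationship directly.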
1.1 Examining Storage
One of the most powerful diagnostic tools in Linux is lsblk (List Block Devices), which provides a visual tree representation of all connected storage devices. When you run this command in the terminal, it displays a hierarchical view that shows physical disks, their partitions, and any logical volumes built on top of them. This tree structure makes it immediately clear which partitions belong to which disks and how storage is organized across the system. For administrators managing Proxmox servers, lsblk is indispensable for quickly understanding storage topology without needing to parse complex configuration files. The diagram below illustrates how Linux represents different types of storage devices and their partition schemes:
Figure 1: Linux Block Devices and Partitions - How the kernel represents different storage types (SATA, NVMe, and their partitions)
# Tree view of all disks
lsblk
# Detailed view with UUIDs and Filesystem Types
lsblk -f
1.2 Managing Partitions (fdisk)
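To create partitions on a disk, we use fdisk or parted. Partitioning only divides the disk into sections; a filesystem is created on a partition afterwards with a tool such as mkfs.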
# Open the utility for disk /dev/sdb
sudo fdisk /dev/sdb
# Commands inside tool:
# 'n' -> New Partition
# 'w' -> Write Changes
Linux treats disks as Block Devices (e.g., /dev/sda). Use lsblk to visualize the storage hierarchy and fdisk/parted to manage partitions. The host OS must recognize the disk before Proxmox can utilize it.
Reflection: Consider why NVMe drives have names like nvme0n1 instead of sda. What happens if you attempt to partition a disk that is already mounted and actively in use?
2. Logical Volume Manager (LVM)
Traditional partitions are rigid. If /dev/sda1 is 10GB and fills up, you cannot easily "grow" it if /dev/sda2 is right next to it. LVM abstracts physical disks into a flexible pool of storage.
2.1 The LVM Hierarchy
The three-tier architecture of LVM provides the flexibility that traditional partitions lack. As shown in the diagram below, the hierarchy flows from physical disks to virtual volumes:
Figure 2: LVM Three-Tier Hierarchy - Physical Volumes (PV) combine into Volume Groups (VG), which are divided into Logical Volumes (LV)
The hierarchy consists of three layers. At the foundation is the Physical Volume (PV), which represents the actual disk or partition (for example, /dev/sdb). These physical volumes are then combined into a Volume Group (VG), which acts as a unified storage pool—for instance, a data_pool might aggregate multiple drives to provide 500GB of total capacity. Finally, Logical Volumes (LV) are carved out from the volume group and allocated for specific uses, such as vm-100-disk for a virtual machine.
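The pooling idea can be checked with plain arithmetic. The sketch below calls no LVM tools; it assumes two hypothetical physical volumes of 200 GiB and 300 GiB and the default LVM physical extent size of 4 MiB. A real volume group would report slightly fewer extents, because each physical volume reserves a little space for metadata.

```shell
#!/bin/sh
# Model LVM pooling with arithmetic only (no LVM commands are run).
# Assumed values: PVs of 200 GiB and 300 GiB, 4 MiB physical extents.
PE_MIB=4

gib_to_extents() { echo $(( $1 * 1024 / PE_MIB )); }
extents_to_gib() { echo $(( $1 * PE_MIB / 1024 )); }

pv1=$(gib_to_extents 200)
pv2=$(gib_to_extents 300)
vg_total=$(( pv1 + pv2 ))      # the VG pools the extents of every PV
lv=$(gib_to_extents 10)        # a 10 GiB LV is carved out of the pool
vg_free=$(( vg_total - lv ))   # what remains for other volumes

echo "VG total: $vg_total extents ($(extents_to_gib "$vg_total") GiB)"
echo "LV size:  $lv extents (10 GiB)"
echo "VG free:  $vg_free extents ($(extents_to_gib "$vg_free") GiB)"
```

Because a logical volume is just a list of extents, growing it means handing it more free extents from the pool, wherever they happen to sit on the physical disks.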
2.2 Hands-On LVM Commands
Proxmox uses LVM extensively. Here is how you manage it manually.
# 1. Initialize a disk for LVM
sudo pvcreate /dev/sdb
# 2. Create a Volume Group named 'data_vg'
sudo vgcreate data_vg /dev/sdb
# 3. Create a 10GB Logical Volume
sudo lvcreate -n my_volume -L 10G data_vg
# 4. View your work
sudo vgs # View Groups
sudo lvs # View Volumes
LVM adds a flexible abstraction layer using Physical Volumes (PV), Volume Groups (VG), and Logical Volumes (LV). This architecture enables dynamic resizing and pooling of disparate physical disks.
Reflection: Can you shrink an LVM volume while it is online and actively in use? What is the difference between standard LVM and LVM-Thin provisioning?
3. ZFS: The Enterprise Standard (New Material)
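ZFS (Zettabyte File System) combines a filesystem and a volume manager in a single stack, and it is often preferred over standard hardware RAID. Because it manages the physical disks directly, it can verify every block with checksums, and it also provides compression and deduplication.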
3.1 Why ZFS?
Copy-on-Write (CoW) is one of ZFS's foundational design principles. When you edit a file, ZFS does not overwrite the old data in place. Instead, it writes the new data to a fresh block on the disk and then updates the pointer to reference the new location. The benefit of this approach is profound: if power fails during a write operation, the old data remains valid and intact. There is no corruption because the original block is never destroyed until the write is confirmed to be successful.
The illustration below compares traditional write operations (which overwrite data in place) versus ZFS's Copy-on-Write approach:
Figure 3: ZFS Copy-on-Write (CoW) - Traditional filesystems overwrite data in place; ZFS writes to new blocks and updates pointers
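The write-then-repoint sequence can be imitated with ordinary files. In this toy model, data "blocks" are files and a small pointer file names the block that is currently live. The file names are hypothetical, and this illustrates the idea only; it is not how ZFS lays out data on disk.

```shell
#!/bin/sh
# Toy copy-on-write: never overwrite the live block, write a new one
# and then switch the pointer in a single step.
store=$(mktemp -d)

printf 'version 1\n' > "$store/block-1"
printf 'block-1\n'   > "$store/pointer"

# Step 1: write the new data to a fresh block. The live block is untouched,
# so a crash at this point leaves the pointer on valid old data.
printf 'version 2\n' > "$store/block-2"

# Step 2: update the pointer. Renaming a file replaces the old pointer
# atomically, so readers see either the old or the new block, never a mix.
printf 'block-2\n' > "$store/pointer.new"
mv "$store/pointer.new" "$store/pointer"

live=$(cat "$store/$(cat "$store/pointer")")
old=$(cat "$store/block-1")
echo "live data: $live"
echo "old block still intact: $old"
rm -rf "$store"
```

The old block is only a candidate for reuse once nothing references it any more, which is exactly what makes snapshots cheap.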
Self-Healing is another critical feature of ZFS. The filesystem stores a checksum (a digital fingerprint) for every block of data; the default algorithm is fletcher4, and a cryptographic hash such as SHA-256 can be selected instead. If a bit silently flips on your drive, whether from a cosmic ray or ageing media (a problem known as bit rot), ZFS detects the mismatch between the data and its checksum when the block is read. If redundancy exists (such as in a mirrored or RAID-Z configuration), ZFS automatically repairs the corrupted block by restoring it from a valid copy.
The self-healing process is visualized below, showing how ZFS detects, validates, and repairs corrupted data blocks:
Figure 4: ZFS Self-Healing - Checksums detect corrupted blocks, which are automatically repaired from redundant copies
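The detect-and-repair cycle can be sketched with two files standing in for the two sides of a mirror. This is a simplified model: sha256sum plays the role of the ZFS block checksum, the file names are hypothetical, and no ZFS tools are involved.

```shell
#!/bin/sh
# Toy self-healing on a two-way mirror using checksums.
work=$(mktemp -d)
printf 'important data\n' > "$work/copy_a"
cp "$work/copy_a" "$work/copy_b"

checksum() { sha256sum < "$1" | cut -d' ' -f1; }
expected=$(checksum "$work/copy_a")        # recorded when the block was written

printf 'importent data\n' > "$work/copy_a" # simulate silent corruption

# Detect: compare every copy against the recorded checksum
good=
for copy in "$work/copy_a" "$work/copy_b"; do
    if [ "$(checksum "$copy")" = "$expected" ]; then
        good=$copy
    else
        echo "checksum mismatch: ${copy##*/}"
    fi
done

# Repair: overwrite each bad copy from a copy that verified correctly
if [ -n "$good" ]; then
    for copy in "$work/copy_a" "$work/copy_b"; do
        if [ "$(checksum "$copy")" != "$expected" ]; then
            cp "$good" "$copy"
            echo "repaired ${copy##*/} from ${good##*/}"
        fi
    done
fi

repaired=$(cat "$work/copy_a")
echo "copy_a now reads: $repaired"
rm -rf "$work"
```

Without a second copy the checksum can still detect the damage, but there is nothing to repair from, which is why redundancy matters.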
3.2 Basic ZFS Commands
Proxmox installs ZFS tools by default.
# 1. Check the health of your storage pool
sudo zpool status
# 2. List all datasets (file systems)
sudo zfs list
# 3. Create a new dataset for ISOs
sudo zfs create rpool/isos
3.3 The Power of Instant Snapshots
The Copy-on-Write architecture unlocks one of ZFS's most remarkable capabilities: instantaneous snapshots. Unlike traditional backup systems that must copy gigabytes or terabytes of data (a process that can take hours), a ZFS snapshot is merely a metadata operation: a lightweight bookmark that marks the current state of the filesystem. When you create a snapshot, ZFS doesn't duplicate any data blocks; it simply freezes a reference point in time. The snapshot consumes almost no disk space initially because it shares all its data blocks with the current filesystem. Only when data begins to change does the snapshot start consuming space, as ZFS preserves the old blocks that the snapshot references while writing new data to fresh locations. This means you can take a snapshot of a 1TB virtual machine in less than a second, with negligible performance impact and no meaningful initial storage overhead.
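# 1. Take a snapshot of a dataset
sudo zfs snapshot rpool/data/vm-100-disk-0@before_update
# 2. Rollback (undo changes); everything written after the snapshot is discarded.
#    Without -r, zfs rollback only accepts the most recent snapshot.
sudo zfs rollback rpool/data/vm-100-disk-0@before_update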
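If you do not have a ZFS pool to experiment on, the way a snapshot shares data with the live filesystem can be imitated with hard links. In this rough analogy a hard link plays the role of the snapshot's reference to existing blocks; the directory and file names are hypothetical, and this is not how ZFS implements snapshots internally.

```shell
#!/bin/sh
# Toy snapshot: share data instead of copying it, then diverge on write.
work=$(mktemp -d)
mkdir "$work/live" "$work/snap"
printf 'config v1\n' > "$work/live/app.conf"

# "Snapshot": a second name for the same data; nothing is copied
ln "$work/live/app.conf" "$work/snap/app.conf"

# Copy-on-write update: write a new file, then rename it over the live name.
# The snapshot keeps referencing the old data.
printf 'config v2\n' > "$work/live/app.conf.new"
mv "$work/live/app.conf.new" "$work/live/app.conf"

live=$(cat "$work/live/app.conf")
snap=$(cat "$work/snap/app.conf")
echo "live: $live"
echo "snapshot: $snap"

# "Rollback": point the live name back at the data the snapshot preserved
ln -f "$work/snap/app.conf" "$work/live/app.conf"
rolled_back=$(cat "$work/live/app.conf")
echo "after rollback: $rolled_back"
rm -rf "$work"
```

As with a real snapshot, taking it costs almost nothing, and extra space is only needed once the live data starts to differ from the preserved version.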
ZFS combines RAID and Volume Management with Copy-on-Write logic. Its self-healing checksums prevent bit-rot, and its architecture enables near-instant snapshots without initial storage overhead.
Reflection: Why does ZFS need direct access to the disk (passthrough) rather than working through a traditional RAID controller? What is the "ARC" in ZFS terms, and how does it improve performance?
4. Virtual Disk Formats
When you create a VM, its hard drive is just a file on the host.
4.1 Raw (.raw)
The Raw disk format is effectively a bit-for-bit representation of a hard drive without any additional metadata or container structure. Because there is no translation layer, guest reads and writes map directly onto the image, making it the most performant option available. However, this simplicity comes at a cost: a 100GB Raw disk consumes the full 100GB of physical space unless the image is created as a sparse file or sits on thin-provisioned storage, and it does not support advanced features like internal snapshots. If you require snapshot capabilities with Raw disks, you must rely on the underlying storage system, such as LVM-Thin or ZFS, to handle them.
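A raw image does not have to occupy its full size if the host creates it as a sparse file. The sketch below compares the apparent size of such a file with the space it actually uses. It assumes GNU/Linux tools (truncate, du) and a host filesystem that supports sparse files; disk.raw is a hypothetical file name and the 16 MiB size is chosen only to keep the example quick.

```shell
#!/bin/sh
# Compare the apparent size of a sparse raw image with its real disk usage.
work=$(mktemp -d)
img="$work/disk.raw"

truncate -s 16M "$img"                        # sets the size, writes no data

apparent_kib=$(( $(wc -c < "$img") / 1024 ))  # size a guest would see
allocated_kib=$(du -k "$img" | cut -f1)       # blocks really used on the host

echo "apparent size:  ${apparent_kib} KiB"
echo "allocated size: ${allocated_kib} KiB"
rm -rf "$work"
```

The allocated figure grows only as data is written into the image, which is the file-level counterpart of thin provisioning on LVM-Thin or ZFS.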
4.2 QCOW2 (QEMU Copy On Write)
QCOW2 is a feature-rich format designed specifically for the QEMU emulator. Unlike Raw, it acts as an intelligent container that creates a layer of abstraction between the VM and the physical disk. This allows for powerful features such as internal snapshots, transparent compression, and encryption directly within the file itself. While this abstraction layer introduces a minor performance overhead compared to Raw, the flexibility it offers, particularly the ability to grow the disk file dynamically as data is added, makes it the standard choice for file-based storage backends like NFS or local directories.
4.3 Summary Comparison
The visual comparison below highlights the key differences between Raw and QCOW2 disk formats:
Figure 5: Virtual Disk Formats - Raw disks offer maximum performance while QCOW2 provides flexibility with snapshots and thin provisioning
| Feature | Raw (.raw) | QCOW2 |
|---|---|---|
| Performance | Highest (Near Native) | High (Slight Overhead) |
| Space Usage | Fixed (Pre-allocated) | Dynamic (Grow on demand) |
| Snapshots | Requires ZFS/LVM support | Built-in (Internal) |
| Portability | Universal (Byte stream) | QEMU-specific |
Raw disks offer near-native performance but lack flexibility. QCOW2 provides thin provisioning and internal snapshots. The choice depends on the underlying storage backend (LVM/ZFS vs. NFS/File).
Reflection:
- Why can't you take an internal snapshot on a Raw disk?
- How does "sparse provisioning" differ from "thin provisioning"?
8. Additional Resources
- Proxmox Storage Wiki: Official Docs
- ZFS for Dummies: ArsTechnica Guide
- LVM Cheat Sheet: Red Hat LVM
9. Lab Exercises
- Lab 1: Storage
  - Goal: Introduction to Local Storage and LVM.
- Lab 2: Shared Storage
  - Goal: Connecting to NFS and iSCSI targets.